Search Results: "Joachim Breitner"

9 May 2016

Joachim Breitner: Doctoral Thesis Published

I have officially published my doctoral thesis Lazy Evaluation: From natural semantics to a machine-checked compiler transformation (DOI: 10.5445/IR/1000054251). The abstract of the 226-page document that earned me a summa cum laude reads:
In order to solve a long-standing problem with list fusion, a new compiler transformation, 'Call Arity', is developed and implemented in the Haskell compiler GHC. It is formally proven to not degrade program performance; the proof is machine-checked using the interactive theorem prover Isabelle. To that end, a formalization of Launchbury's Natural Semantics for Lazy Evaluation is modelled in Isabelle, including a correctness and adequacy proof.
and I assembled all relevant artefacts (the thesis itself, its LaTeX-sources, the Isabelle theories, various patches against GHC, raw benchmark results, errata etc.) at http://www.joachim-breitner.de/thesis/. Other, less retrospective news: My paper on the Incredible Proof Machine got accepted at ITP in Nancy, and I was invited to give a keynote demo about the proof machine at LFMTP in Porto. Exciting!

17 February 2016

Joachim Breitner: Free Paradoxes

Maybe it works if he is a freelancer?

9 February 2016

Joachim Breitner: GHC performance is rather stable

Johannes Bechberger, while working on his Bachelor's thesis supervised by my colleague Andreas Zwinkau, has developed a performance benchmark runner and results visualizer called temci, and used GHC as a guinea pig. You can read his elaborate analysis on his blog. This is particularly interesting given recent discussions about GHC itself becoming slower and slower, as for example observed by Johannes Waldmann and Anthony Cowley.

Johannes Bechberger's take-away is that, at least for the programs at hand (which were taken from The Computer Language Benchmarks Game), there are hardly any changes worth mentioning, as most of the observed effects are less than a standard deviation and hence insignificant. He tries hard to distill some useful conclusions from the data. If you are interested, please head over to Johannes's post, look at the gory details of the analysis, and give him feedback. Also, maybe his tool temci is something you want to try out?

Personally, I find it dissatisfying to learn so little from so much work, but as he writes: "It's so easy to lie with statistics." And, I might add, to lie to yourself, e.g. by ignoring good advice about standard deviations and significance. I'm sure my tool gipeda (which powers perf.haskell.org) is guilty of that sin. Maybe a different selection of test programs would yield more insight: the benchmarks game's programs are too small and hand-optimized, the nofib programs are plain old, and the fibon collection has bitrotted. I would love to see a curated collection of real-world programs, bundled with all dependencies and frozen to allow meaningful comparisons, but updated to a new, clearly marked revision on a roughly bi-yearly basis (maybe Haskell-SPEC-2016, if that were not a trademark infringement).
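For readers who want to apply the same sanity check to their own numbers, here is a minimal Haskell sketch (emphatically not temci's actual algorithm, and all sample values made up) of the rule of thumb above: treat a difference between two benchmark runs as insignificant unless it exceeds the runs' standard deviation.

```haskell
-- Minimal sketch: two benchmark runs are treated as indistinguishable
-- when their means differ by less than one standard deviation.
mean :: [Double] -> Double
mean xs = sum xs / fromIntegral (length xs)

stddev :: [Double] -> Double
stddev xs = sqrt (mean [ (x - m) ^ (2 :: Int) | x <- xs ])
  where m = mean xs

-- Is the difference between the runs larger than the noise?
significant :: [Double] -> [Double] -> Bool
significant old new =
    abs (mean old - mean new) > max (stddev old) (stddev new)

main :: IO ()
main = do
    let old = [10.0, 10.2, 9.8, 10.1]   -- hypothetical runtimes (s)
        new = [10.1, 10.3, 9.9, 10.0]   -- a tiny shift, within noise
    print (significant old new)         -- prints False
```

With real data one would of course use a proper statistical test, but even this crude check already rules out most of the apparent regressions.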

8 February 2016

Joachim Breitner: Protecting static content with mod_rewrite

For fourteen years I have been photographing digitally and putting the pictures on my webpage. Back then, online privacy was not a big deal, but things have changed, and I had to at least mildly protect the innocent. In particular, I wanted to prevent search engines from accessing some of my pictures. As I did not want my friends and family to have to create an account and remember a password, I set up an OpenID-based scheme five years ago. This way, they could use any of their OpenID-enabled accounts, e.g. their Google Mail account, to log in, without disclosing any data to me. As my photo album consists of just static files, I created two copies on the server: the real one with everything, and a bunch of symbolic links representing the publicly visible parts. I then used mod_auth_openid to prevent access to the protected files unless the user logged in. I never got around to actually limiting who could log in, so strangers were still able to see all photos, but at least search engine spiders were locked out.

But, very unfortunately, OpenID never really caught on, Google even stopped being a provider, and other promising decentralized authentication schemes like Mozilla Persona are also being phased out. So I needed an alternative. A very simple scheme would be a single password that my friends and family can get from me and that unlocks the pictures. I could have done that using HTTP Auth, but that is not very user-friendly, and the login does not persist (at least not without the help of the browser). Instead, I wanted something that involves a simple HTTP form. But I also wanted to avoid server-side programming, for performance and security reasons; I love serving static files whenever it is feasible. Then I found that mod_rewrite, Apache's all-around tool for URL rewriting and request mangling, supports reading and writing cookies! So I came up with a scheme that implements the whole login logic in the Apache server configuration.
I'd like to describe this setup here, in case someone finds it inspiring. I created a login.html with a simple HTML form:
<form method="GET" action="/bilder/login.html">
 <div style="text-align:center">
  <input name="password" placeholder="Password" />
  <button type="submit">Sign-In</button>
 </div>
</form>
It sends the user to the same page again, putting the password into the query string, hence the method="GET" (mod_rewrite unfortunately cannot read the parameters of a POST request). The Apache configuration is as follows:
RewriteMap public "dbm:/var/www/joachim-breitner.de/bilder/publicfiles.dbm"
<Directory /var/www/joachim-breitner.de/bilder>
 RewriteEngine On
 # This is a GET request, trying to set a password.
 RewriteCond %{QUERY_STRING} password=correcthorsebatterystaple
 RewriteRule ^login.html /bilder/loggedin.html [L,R,QSD,CO=bilderhp:correcthorsebatterystaple:www.joachim-breitner.de:2000000:/bilder]
 # This is a GET request, trying to set a wrong password.
 RewriteCond %{QUERY_STRING} password=
 RewriteRule ^login.html /bilder/notloggedin.html [L,R,QSD]
 # No point in logging in if there is already the right password
 RewriteCond %{HTTP:Cookie} bilderhp=correcthorsebatterystaple
 RewriteRule ^login.html /bilder/loggedin.html [L,R]
 # If protected file is requested, check for cookie.
 # If no cookie present, redirect pictures to replacement picture
 RewriteCond %{HTTP:Cookie} !bilderhp=correcthorsebatterystaple
 RewriteCond ${public:$0|private} private
 RewriteRule ^.*\.(png|jpg)$ /bilder/pleaselogin.png [L]
 RewriteCond %{HTTP:Cookie} !bilderhp=correcthorsebatterystaple
 RewriteCond ${public:$0|private} private
 RewriteRule ^.+$ /bilder/login.html [L,R]
</Directory>
The publicfiles.dbm file is generated from a text file with lines like
login.html.en 1
login.html.de 1
pleaselogin.png 1
thumbs/20030920165701_thumb.jpg 1
thumbs/20080813225123_thumb.jpg 1
...
using
/usr/sbin/httxt2dbm -i publicfiles.txt -o publicfiles.dbm
and whitelists all files that are visible without login. Make sure it contains the login page, otherwise you'll get a redirect circle. The other directives in the above configuration fulfill the tasks described in their comments. And that's it! No resource-hogging web frameworks, no security-dubious scripting languages, and a dead-simple way to authenticate. Oh, and if you believe you know me well enough to be allowed to see all photos: the real password is not correcthorsebatterystaple; just ask me what it is.

28 January 2016

Joachim Breitner: Dreaming of role playing

Recently, at a summer-school-like event, we were discussing pen-and-paper role playing. I'm not sure if this was after a session of role playing, but I was making the point that you don't need much, or any, of the rules, scores and dice if you are one of the story-telling role players, and that it can actually be more fun this way. As an example, I said, it can make sense if one of the players (and the game master, I suppose) reads up a lot about one aspect of the fantasy world, e.g. one geographical area, one cult, one person, and then this knowledge is used to create an exciting puzzle, even without any opponents. I'm not quite sure, but I think I fell asleep shortly after, and I dreamed of such a role playing session. It was going roughly like this:
I (a human) and my fellows (at least a dwarf, not sure about the rest) went to some castle. It was empty, but scary. We crossed its hall and went into a room on the other side. It was locked towards the hall by a door that covered the door frame only partly, and suddenly we could see a large ogre, together with other foul folk not worth mentioning, hammering at the door. My group (which was a bit larger in that moment) all prepared to shoot arrows at him the moment he burst through the door. I had the time to appreciate the ingenuity that we all waited for him to burst through, so that none of the arrows would bounce off the door, but it did not help, and we ran from the castle, over a field, through a forest, at the other side of which we could see, below a steep slope, a house, so we went there. The path towards it was filled with tracks that looked surprisingly like car tracks. When we reached the spot there was no house any more, but rather a cold campsite. We saw digging tools, and helmets (strangely, baseball helmets) arranged in a circle, as if it were a burial site. We set up camp there and slept. It occurred to me that I must have been the rightful owner of the castle, and that it was taken from me by my brother and his wife, who denied my existence, or something treacherous like that. When we woke up at the campsite, she was there, together with what must be my niece. My sister-in-law mocked us for fighting unsuccessfully at the castle, but my niece was surprised to see me, as I must have a very similar appearance to my brother. She said that her mother forbade it, but she nevertheless sneakily took something which looked like a Gameboy with a camera attachment and a CompactFlash card out of her mother's purse, put the card in and took a photo of me. This is when I realized that I would get my castle back.
At that moment, I woke up. I somewhat liked the story (and it was a bit more coherent in my mind than what I then wrote down here), so I wanted to write it down. I quickly fetched my laptop. My friends at the summer school were a bit worried, and I promised not to mention their names and concrete places, and started writing. They distracted me, so I searched for a place of my own, lay down (why? no idea), and continued writing. I had to resort to touch-writing on my belly, because my laptop was not actually there. I also noticed that I was back at the campsite, and that I was still wearing the back protector that I must have worn while fighting in the castle, and which I did not take off while sleeping at the campsite. Funnily, it was not a proper medieval armour, but rather my snowboarding back protector.
At that moment, I woke up. I somewhat liked the story (and it was a bit more coherent in my mind than what I then wrote down here), so I wanted to write it down. I quickly got up, started my laptop, and wrote it down. And this is what you are reading right now. Off to bed again, let's see what happens next.

20 December 2015

Lunar: Reproducible builds: week 34 in Stretch cycle

What happened in the reproducible builds effort between December 13th and December 19th:

Infrastructure

Niels Thykier started implementing support for .buildinfo files in dak. A very preliminary commit was made by Ansgar Burchardt to prevent .buildinfo files from being removed from the upload queue.

Toolchain fixes

Mattia Rizzolo rebased our experimental debhelper with the changes from the latest upload. New fixes have been merged by OCaml upstream.

Packages fixed

The following 39 packages have become reproducible due to changes in their build dependencies: apache-mime4j, avahi-sharp, blam, bless, cecil-flowanalysis, cecil, coco-cs, cowbell, cppformat, dbus-sharp-glib, dbus-sharp, gdcm, gnome-keyring-sharp, gudev-sharp-1.0, jackson-annotations, jackson-core, jboss-classfilewriter, jboss-jdeparser2, jetty8, json-spirit, lat, leveldb-sharp, libdecentxml-java, libjavaewah-java, libkarma, mono.reflection, monobristol, nuget, pinta, snakeyaml, taglib-sharp, tangerine, themonospot, tomboy-latex, widemargin, wordpress, xsddiagram, xsp, zeitgeist-sharp. The following packages became reproducible after getting fixed: Some uploads fixed some reproducibility issues, but not all of them: Patches submitted which have not made their way to the archive yet:

reproducible.debian.net

Packages in experimental are now tested on armhf. (h01ger) Arch Linux packages in the multilib and community repositories (4,000 more source packages) are also being tested. All of these test results are better analyzed and nicely displayed together with each package. (h01ger) For Fedora, build jobs can now run in parallel. Two are currently running, now testing reproducibility of 785 source packages from Fedora 23. mock/1.2.3-1.1 has been uploaded to experimental to better build RPMs. (h01ger) Work has started on having automatic build node pools to maximize use of armhf build nodes. (Vagrant Cascadian)

diffoscope development

Version 43 has been released on December 15th. It has been dubbed "epic!" as it contains many contributions that were written around the summit in Athens. Baptiste Daroussin found that running diffoscope on some Tar archives could overwrite arbitrary files. This has been fixed by using libarchive instead of Python's internal Tar library and by adding a sanity check for destination paths. In any case, until proper sandboxing is implemented, don't run diffoscope on untrusted inputs outside an isolated, throw-away system. Mike Hommey identified that the CBFS comparator would needlessly waste time scanning big files. It will now not consider any files bigger than 24 MiB (8 MiB more than the largest ROM created by coreboot at this time). An encoding issue related to Zip files has also been fixed. (Lunar) New comparators have been added: Android dex files (Reiner Herrmann), filesystem images using libguestfs (Reiner Herrmann), icons and JPEG images using libcaca (Chris Lamb), and OS X binaries (Clemens Lang). The comparator for Free Pascal Compilation Units will now only be used when the unit version matches the compiler's. (Levente Polyak) A new multi-file HTML output with on-demand loading of long diffs is available through the --html-dir option. On-demand loading requires jQuery, whose path can be specified through the --jquery option. The diffs can also simply be browsed by non-JavaScript users, or when jQuery is not available. (Joachim Breitner) Example of on-demand loading in diffoscope. Portability toward other systems has been improved: old versions of GNU diff are now supported (Mike McQuaid), the suggested locale is now the more generic en_US.UTF-8 (Ed Maste), and the --list-tools option can now support multiple systems (Mattia Rizzolo, Levente Polyak, Lunar). Many internal changes and code clean-ups have been made, paving the way for parallel processing. (Lunar) Version 44 was released on December 18th, fixing an issue affecting .debs lacking a md5sums file that was introduced in a previous refactoring. (Lunar) Support has been added for Mozilla-optimized Zip files. (Mike Hommey) The HTML output has been optimized in size (Mike Hommey, Esa Peuha, Lunar) and speed (Lunar), and will now properly number lines. (Mike Hommey) A message will always be displayed when lines are ignored at the end of a diff. (Lunar) For portability and consistency, Python's os.walk() function is now used instead of find to perform directory listings. (Lunar)

Documentation update

Package reviews

143 reviews have been removed, 69 added and 22 updated in the previous week. Chris Lamb reported 12 new FTBFS issues. New issues identified this week: random_order_in_init_py_generated_by_python-genpy, timestamps_in_copyright_added_by_perl_dist_zilla, random_contents_in_dat_files_generated_by_chasen-dictutils_makemat, timestamps_in_documentation_generated_by_pandoc. Chris West made some improvements to the scripts used to manage notes in the misc repository.

Misc.

Accounts of the reproducible builds summit in Athens were written by Thomas Klausner from NetBSD and Hans-Christoph Steiner from The Guardian Project. Some openSUSE developers are working on reproducible builds during a hackweek, which was discussed on the opensuse-packaging mailing-list.

18 December 2015

Joachim Breitner: A multitude of early Christmas presents

Today was a nice day with a surprising number of early Christmas presents:

11 December 2015

Lunar: Reproducible builds: week 32 in Stretch cycle

The first reproducible world summit was held in Athens, Greece, from December 1st to 3rd, with the support of the Linux Foundation, the Open Tech Fund, and Google. Faidon Liambotis has been an amazing help sorting out all local details. People at ImpactHub Athens have been perfect hosts. North of Athens, seen from the Acropolis, with ImpactHub in the center. Nearly 40 participants from 14 different free software projects had very busy days sharing knowledge, building understanding, and producing actual patches. Anyone interested in cross-project discussions should join the rb-general mailing-list. What follows focuses mostly on what happened for Debian this previous week. A more detailed report about the summit will follow soon. You can also read the reports from Joachim Breitner from Debian, Clemens Lang from MacPorts, Georg Koppen from Tor, and Dhiru Kholia from Fedora, and Ludovic Courtès wrote one for Guix and for the GNU project.

Infrastructure

Several discussions at the meeting helped refine a shared understanding of what kind of information should be recorded about a build, and how it could be used. Daniel Kahn Gillmor sent a detailed update on how .buildinfo files should become part of the Debian archive. Some key changes compared to what we had in mind at DebConf15: Hopefully, ftpmasters will be able to comment on the updated proposal soon.

Packages fixed

The following packages have become reproducible due to changes in their build dependencies: fades, triplane, caml-crush, globus-authz. The following packages became reproducible after getting fixed: Some uploads fixed some reproducibility issues, but not all of them: Patches submitted which have not made their way to the archive yet: akira sent proposals on how to make bash reproducible. Alexander Couzens submitted a patch upstream to add support for SOURCE_DATE_EPOCH in the grub image generator (#787795).
reproducible.debian.net

An issue with some armhf build nodes was tracked down to a bad interaction between the uname26 personality and the new glibc. (Vagrant Cascadian) A Debian package was created for koji, the RPM building and tracking system used by Fedora amongst others. It is currently waiting for review in the NEW queue. (Ximin Luo, Marek Marczykowski-Górecki)

diffoscope development

diffoscope now has a dedicated mailing list to better accommodate its growing user and developer base. Going through diffoscope's guts together enabled several new contributors. Baptiste Daroussin, Ed Maste, Clemens Lang, Mike McQuaid, and Joachim Breitner all contributed their first patches to improve portability or add new features. Regular contributors Chris Lamb, Reiner Herrmann, and Levente Polyak also submitted improvements. diffoscope hacking session in Athens. The next release should support more operating systems, filesystem image comparison via libguestfs, HTML reports with on-demand loading, and parallel processing, for the most noticeable improvements.

Package reviews

27 reviews have been removed, 17 added and 14 updated in the previous week. Chris Lamb and Val Lorentz filed 4 new FTBFS reports.

Misc.

Baptiste Daroussin has started to implement support for SOURCE_DATE_EPOCH in FreeBSD, in libpkg and the ports tree. Thanks to Joachim Breitner and h01ger for the pictures.

4 December 2015

Joachim Breitner: HaL 10

Today I had the honour to hold my first invited talk: I was the keynote speaker at the tenth Haskell in Leipzig workshop. I chose to present how I used the MonadFix type class to conveniently, and correctly by construction, calculate offsets while serialising to a binary data format in the implementation of tttool. Johannes Waldmann could not resist pointing out that MonadFix was already a topic at the very first HaL meeting nine years ago, but nevertheless I think that my live-coded, slide-less talk was quite suitable for the given audience of roughly 55 attendees. I have uploaded a transcript of this talk.
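Since the talk was live-coded, here is a hedged, self-contained sketch of the general MonadFix trick (all names such as Put, emit and offset are made up for illustration; this is not tttool's actual code): a tiny serialisation monad whose mdo block can emit an offset before the position it refers to is known.

```haskell
{-# LANGUAGE RecursiveDo #-}
-- A toy serialisation monad that tracks the current offset and,
-- thanks to MonadFix, lets us write the offset of a block *before*
-- the block itself has been serialised.
import Control.Monad.Fix

newtype Put a = Put { runPut :: Int -> (a, Int, [Int]) }
                -- offset in, (result, offset out, emitted "bytes")

instance Functor Put where
    fmap f (Put g) = Put $ \o -> let (a, o', bs) = g o in (f a, o', bs)
instance Applicative Put where
    pure x = Put $ \o -> (x, o, [])
    Put f <*> Put x = Put $ \o ->
        let (g, o1, bs1) = f o
            (a, o2, bs2) = x o1
        in (g a, o2, bs1 ++ bs2)
instance Monad Put where
    Put g >>= k = Put $ \o ->
        let (a, o1, bs1) = g o
            (b, o2, bs2) = runPut (k a) o1
        in (b, o2, bs1 ++ bs2)
instance MonadFix Put where
    -- lazily tie the knot: feed the result back into the computation
    mfix f = Put $ \o -> let res@(a, _, _) = runPut (f a) o in res

emit :: [Int] -> Put ()
emit bs = Put $ \o -> ((), o + length bs, bs)

offset :: Put Int
offset = Put $ \o -> (o, o, [])

-- The header stores the payload's offset before the payload is written.
example :: [Int]
example = case runPut go 0 of (_, _, bs) -> bs
  where
    go = mdo
        emit [pos]       -- one "byte" holding the offset of the payload
        emit [0, 0]      -- padding
        pos <- offset    -- the position where the payload starts
        emit [42, 43]    -- the payload
        return ()

main :: IO ()
main = print example   -- prints [3,0,0,42,43]
```

The crucial point is that emit never forces the value it writes, only its length, so the circular reference in the mdo block terminates.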

Joachim Breitner: Reproducible Builds World Summit

This week, I was attending the first Reproducible Builds World Summit in Athens. A while ago, I had fixed some reproducibility bugs in the Haskell compiler GHC (parts of #4012), and that got me a ticket to this fully sponsored and nicely organised event. This was the first event I attended that had a dedicated and professional moderator, Gunner from Aspiration, who came up with a suitable on-the-fly schedule, ensured the time was structured by shorter plenary rounds with all ~40 attendees as well as longer working sessions in groups of usually at most 10 people, motivated people to run ("facilitate") these working groups, and ensured that any results were noted and collected. This definitely makes a difference compared to the otherwise unstructured "let's all sit in a big circle and endlessly discuss things" or "let's all just hack on what we want to hack on right now" structure of less professionally organised meetings. But after 1½ days of talking, a certain desire to hack and get work done could be felt throughout the group, so most of us were happy to be allowed to make use of the laptop on Wednesday afternoon. At least as far as possible: the wireless was not expecting this crowd, it seems. Besides the few patches above, my involvement in the project is rather rudimentary, so I tried to contribute as far as possible during the event itself. This was not always easy, as many of the working sessions were either self-reflecting or forward-facing: What problems do we see? What should we get done next? How do we talk to users/upstream/press/donors? Where possible, I tried to at least be helpful in the discussion. During the hacking sessions, I found it most useful to contribute to diffoscope, a program to recursively and deeply show the difference between files of any format, so there are now patches to implement a multi-file HTML output, to more efficiently include large diffs, and more importantly I assisted Lunar in making diffoscope multi-threaded.
In Python this is not a very great experience; I guess I am spoiled by Haskell (as is often the case). I enjoyed the group; a bit like DebConf, but with many new faces from similarly-minded, but different, projects. It was, however, again a Free Software event with an embarrassingly low number of female participants. I liked discussing a proof of Gödel's first incompleteness theorem during a skill exchange session. Unfortunately, I had to leave promptly after the event on Thursday evening, in order to attend the Haskell in Leipzig workshop.

31 October 2015

Joey Hess: a tiling region manager for the console

Building on top of concurrent-output, and some related work Joachim Breitner did earlier, I now have a kind of equivalent to a tiling window manager, except it's managing regions of the console for different parts of a single program. Here's a really silly demo, in an animated gif: demo2.gif Not bad for 23 lines of code, is it? It seems much less tedious to do things this way than using ncurses. Even with its panels, ncurses requires you to think about the layout of various things on the screen, and many low-level details. This, by contrast, is compositional: just add another region and a thread to update it, and away it goes. So, here's an apt-like download progress display, in 30 lines of code. aptdemo.gif Not only does it have regions which are individual lines of the screen, but those can have sub-regions within them, as seen here (and so on). And log-type messages automatically scroll up above the regions. External programs run by createProcessConcurrent will automatically get their output/errors displayed there, too. What I'm working on now is support for multiline regions, which automatically grow/shrink to fit what's placed in them. The hard part, which I'm putting the finishing touches on, is to accurately work out how large a region is before displaying it, in order to lay it out. This requires parsing ANSI codes among other things.

STM rules

There's so much concurrency, with complicated interrelated data being updated by different threads, that I couldn't have possibly built this without Software Transactional Memory. Rather than a nightmare of locks behind locks behind locks, the result is so well behaved that I'm confident that anyone who needs more control over the region layout, or wants to do funky things, can dive into the STM interface and update the data structures, and nothing will ever deadlock or be inconsistent, and as soon as an update completes, it'll display on-screen.
As an example of how powerful and beautiful STM is, here's how the main display thread determines when it needs to refresh the display:
data DisplayChange
        = BufferChange [(StdHandle, OutputBuffer)]
        | RegionChange RegionSnapshot
        | TerminalResize (Maybe Width)
        | EndSignal ()
    ...
                change <- atomically $
                        (RegionChange <$> regionWaiter origsnapshot)
                                `orElse`
                        (RegionChange <$> regionListWaiter origsnapshot)
                                `orElse`
                        (BufferChange <$> outputBufferWaiterSTM waitCompleteLines)
                                `orElse`
                        (TerminalResize <$> waitwidthchange)
                                `orElse`
                        (EndSignal <$> waitTSem endsignal)
                case change of
                        RegionChange snapshot -> do
                ...
                        BufferChange buffers -> do
                ...
                        TerminalResize width -> do
                ...
So, it composes all these STM actions that can wait on various kinds of changes, to get one big action, that waits for all of the above, and builds up a nice sum type to represent what's changed. Another example is that the whole support for sub-regions only involved adding 30 lines of code, all of it using STM, and it worked 100% the first time.
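For readers who have not used STM's orElse before, here is a minimal, self-contained sketch of the same pattern (not the actual concurrent-output code; the Change type and both waiters are made up): each waiter blocks with retry until its TVar carries a value, and orElse chains the waiters into one action that reports whichever change arrives first.

```haskell
-- Minimal sketch of composing STM waiters with orElse.
import Control.Concurrent.STM

data Change = Resized Int | NewLine String deriving (Eq, Show)

waitResize :: TVar (Maybe Int) -> STM Change
waitResize v = do
    mw <- readTVar v
    case mw of
        Nothing -> retry               -- nothing yet: block until v changes
        Just w  -> Resized w <$ writeTVar v Nothing

waitLine :: TVar (Maybe String) -> STM Change
waitLine v = do
    ml <- readTVar v
    case ml of
        Nothing -> retry
        Just l  -> NewLine l <$ writeTVar v Nothing

main :: IO ()
main = do
    size <- newTVarIO Nothing
    line <- newTVarIO (Just "hello")
    -- The resize waiter retries, so orElse falls through to the line waiter.
    c <- atomically (waitResize size `orElse` waitLine line)
    print c   -- prints NewLine "hello"
```

Because each alternative is an ordinary STM value, adding another kind of event is just one more `orElse` branch, which is exactly what makes the code above so compositional.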
Available in concurrent-output 1.1.0.

17 October 2015

Joey Hess: it's a bird, it's a plane, it's a super monoid for propellor

I've been doing a little bit of dynamically typed programming in Haskell, to improve Propellor's Info type. The result is kind of interesting, in a scary way. Info started out as a big record type, containing all the different sorts of metadata that Propellor needed to keep track of: host IP addresses, DNS entries, ssh public keys, docker image configuration parameters... This got quite out of hand. Info needed to have its hands in everything, even types that should have been private to their module. To fix that, recent versions of Propellor let a single Info contain many different types of values. Look at it one way and it contains DNS entries; look at it another way and it contains ssh public keys, etc. As an émigré from lands where you can never know what type of value is in a $foo until you look, this was a scary prospect at first, but I found it's possible to have the benefits of dynamic types and the safety of static types too. The key to doing it is Data.Dynamic. Thanks to Joachim Breitner for suggesting I could use it here. What I arrived at is this type (slightly simplified):
newtype Info = Info [Dynamic]
    deriving (Monoid)
So Info is a monoid, and it holds a bunch of dynamic values, which could each be of any type at all. Eep! So far, this is utterly scary to me. To tame it, the Info constructor is not exported, and so the only way to create an Info is to start with mempty and use this function:
addInfo :: (IsInfo v, Monoid v) => Info -> v -> Info
addInfo (Info l) v = Info (toDyn v : l)
The important part is that addInfo only allows adding values that are in the IsInfo type class. That prevents the foot-shooting associated with dynamic types, by only allowing use of types that make sense as Info. Otherwise arbitrary Strings etc. could be passed to addInfo by accident and all get concatenated together, and that would be a total dynamic programming mess. Anything you can add into an Info, you can get back out:
getInfo :: (IsInfo v, Monoid v) => Info -> v
getInfo (Info l) = mconcat (mapMaybe fromDynamic (reverse l))
Only monoids can be stored in Info, so if you ask for a type that an Info doesn't contain, you'll get back mempty. Crucially, IsInfo is an open type class. Any module in Propellor can make a new data type and make it an instance of IsInfo, and then that new data type can be stored in the Info of a Property, and any Host that uses the Property will have that added to its Info, available for later introspection.
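As a hedged, self-contained illustration of this technique (simplified from Propellor: the Aliases type below is made up, and the real IsInfo class also carries a propigateInfo method, omitted here), the whole pattern in runnable form looks like this:

```haskell
-- Info holds arbitrary Dynamic values, but addInfo/getInfo restrict
-- access to Typeable monoids via a dedicated class.
import Data.Dynamic
import Data.Maybe (mapMaybe)
import Data.Typeable (Typeable)

newtype Info = Info [Dynamic]

instance Semigroup Info where
    Info a <> Info b = Info (a ++ b)
instance Monoid Info where
    mempty = Info []

-- Only types blessed with an IsInfo instance can go in.
class Typeable v => IsInfo v

addInfo :: (IsInfo v, Monoid v) => Info -> v -> Info
addInfo (Info l) v = Info (toDyn v : l)

-- fromDynamic silently skips values of other types; asking for a type
-- that was never added yields mempty.
getInfo :: (IsInfo v, Monoid v) => Info -> v
getInfo (Info l) = mconcat (mapMaybe fromDynamic (reverse l))

-- A made-up kind of info; each module could define its own.
newtype Aliases = Aliases [String] deriving (Eq, Show)
instance Semigroup Aliases where
    Aliases a <> Aliases b = Aliases (a ++ b)
instance Monoid Aliases where
    mempty = Aliases []
instance IsInfo Aliases

main :: IO ()
main = do
    let i = addInfo (addInfo mempty (Aliases ["foo"])) (Aliases ["bar"])
    print (getInfo i :: Aliases)   -- prints Aliases ["foo","bar"]
```

The type annotation on getInfo is what selects which slice of the heterogeneous list you see, which is the "look at it one way / look at it another way" behaviour described above.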
For example, this weekend I'm extending Propellor to have controllers: Hosts that are responsible for running Propellor on some other hosts. Useful if you want to run propellor once and have it update the configuration of an entire network of hosts. There can be whole chains of controllers controlling other controllers etc. The problem is, what if host foo has the property controllerFor bar and host bar has the property controllerFor foo? I want to avoid a loop of foo running Propellor on bar, running Propellor on foo, ... To detect such loops, each Host's Info should contain a list of the Hosts it's controlling. Which is not hard to accomplish:
newtype Controlling = Controlled [Host]
    deriving (Typeable, Monoid)
isControlledBy :: Host -> Controlling -> Bool
h `isControlledBy` (Controlled hs) = any (== hostName h) (map hostName hs)
instance IsInfo Controlling where
    propigateInfo _ = True
mkControllingInfo :: Host -> Info
mkControllingInfo controlled = addInfo mempty (Controlled [controlled])
getControlledBy :: Host -> Controlling
getControlledBy = getInfo . hostInfo
isControllerLoop :: Host -> Host -> Bool
isControllerLoop controller controlled = go S.empty controlled
  where
    go checked h
        | controller `isControlledBy` c = True
        -- avoid checking loops that have been checked before
        | hostName h `S.member` checked = False
        | otherwise = any (go (S.insert (hostName h) checked)) l
      where
        c@(Controlled l) = getControlledBy h
This is all internal to the module that needs it; the rest of propellor doesn't need to know that the Info is being used for this. And yet, the necessary information about Hosts is gathered as propellor runs.
So, that's a useful technique. I do wonder if I could somehow make addInfo combine together values in the list that have the same type; as it is the list can get long. And, to show Info, the best I could do was this:
instance Show Info where
    show (Info l) = "Info " ++ show (map dynTypeRep l)
The resulting long list of the types of values stored in a host's info is not as useful as it could be. Of course, getInfo can be used to get any particular type of value:
*Main> hostInfo kite
Info [InfoVal System,PrivInfo,PrivInfo,Controlling,DnsInfo,DnsInfo,DnsInfo,AliasesInfo, ...
*Main> getInfo (hostInfo kite) :: AliasesInfo
AliasesInfo (fromList ["downloads.kitenet.net","git.joeyh.name","imap.kitenet.net","nntp.olduse.net" ...
And finally, I keep trying to think of a better name than "Info".

14 October 2015

Joachim Breitner: Constructing a list in a monad revisited

Two years ago, I discussed various ways of constructing a list in a Monad (or, more specifically, in IO) in Haskell, and compared their merits in terms of elegance, stack usage, number of traversals of the list and run-time efficiency. Recently, two blog posts discussed the issue further and proposed new, more daring alternatives. Neil Mitchell breaks through the abstraction provided by IO, duplicates the world and traverses the list twice, and obtains a speed-up for long lists. Twan van Laarhoven went even further and wrote custom C-- code to destructively update the tail-pointer of the list cell, to be able to create the list completely evaluated from the start. This basically answers my question from two years ago:
I'm still wondering: What would be required from Haskell, GHC or the monads in question to have a fully satisfactory solution here, i.e. one that is as fast as the naive recursion, as stack efficient as the difference list solution, and allocates no thunks, only list cells?
He also has a variant with a slightly nicer interface around "holes", i.e. explicit objects on the heap that can later be replaced by indirections. Obviously, both approaches are very unsafe. I took this as an opportunity to redo my benchmark measurements, and include their variants (named escapeIO, hackIO and holeIO). The following table omits the variants with quadratic performance, as I ran it on longer lists now:
Variant           10^0   10^1    10^2     10^3     10^4     10^5     10^6
accumReverse      37ns   153ns   1134ns   12µs     208µs    8540µs   97ms
recursion         29ns   139ns   680ns    6790ns   160µs    6441µs   76ms
replicateM        26ns   126ns   677ns    6785ns   168µs    6314µs   78ms
accumDList        35ns   165ns   995ns    10µs     190µs    9706µs   100ms
streams           27ns   136ns   691ns    6788ns   173µs    5771µs   75ms
unsafeInterleave  60ns   329ns   2804ns   28µs     373µs    5605µs   57ms
listFix           51ns   412ns   4109ns   56µs     2761µs   42ms     445ms
escapeIO          41ns   187ns   1808ns   16µs     234µs    4409µs   45ms
hackIO            30ns   152ns   1199ns   11µs     140µs    3701µs   42ms
holeIO            40ns   222ns   1725ns   17µs     218µs    4446µs   53ms
The following graph shows that around 10^4 elements, the naive approaches become much slower and the fancy hacks pay off, with Twan's tail-pointer-updating code performing the best. I would really like to see a package that provides an API like Twan's holes, either in this raw unsafe variant (but with the garbage collector related code checked), or with a safe API using type hackery similar to the ST monad that ensures that after normal code gets its hands on a term possibly involving holes, the holes may no longer be modified. I have put the code and results on GitHub.
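For readers who have not seen the earlier post, two of the baseline variants can be sketched as follows. This is a rough reconstruction, not the exact benchmark code, and the names recursionIO and accumDListIO are mine; `return n` stands in for a real IO action producing each element.

```haskell
-- Plain recursion: builds the list in the right order, but is not
-- tail-recursive, so it uses stack space linear in the list length.
recursionIO :: Int -> IO [Int]
recursionIO 0 = return []
recursionIO n = do
    x  <- return n            -- stand-in for a real IO action
    xs <- recursionIO (n - 1)
    return (x : xs)

-- Difference-list accumulator: tail-recursive and constant-stack,
-- but builds a chain of closures ([Int] -> [Int]) that is only
-- collapsed into a list at the very end.
accumDListIO :: Int -> IO [Int]
accumDListIO n0 = go n0 id
  where
    go 0 acc = return (acc [])
    go n acc = do
        x <- return n
        go (n - 1) (acc . (x :))
```

Both produce the same list; the benchmark differences come entirely from how the cells and closures are allocated.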

12 October 2015

Joachim Breitner: Incredible Proof Machine put to the test

I'm currently on the way home, returning from a four-day workshop for high school students; ages 13 to 20, grade 9 to 13. My goal was to teach them something about proofs and logic, in an ambitiously formal and abstract way. For this workshop, I created a computer program called The Incredible Proof Machine that allows the kids to explore creating a proof by placing blocks on a canvas and drawing connections between them (see my previous blog post for an introduction). Subtracting time spent on breaks, organizational stuff and the time the students needed to prepare a final presentation, I think we were working for a total of 14 hours, during which we covered propositional logic. I generally let the students explore and try to solve the tasks of one session mostly on their own, followed by a common discussion of what they had just done, what it means, why it makes sense etc. The sessions were: assumptions and conclusions in general, with conjunction; implication; disjunction; falsum and tertium non datur. We also briefly discussed paper proofs with the students: how they look, and how they relate to the Proof Machine proofs. We had some lecture notes that we handed out piece-wise after each session. The sections were mildly1 password-protected to avoid that the quicker students would work ahead, thus keeping the group together. One or two of the 13 students were doing very well and eventually, I gave them the passwords to the predicate logic section and let them work on them on their own. The quickest managed to solve almost all of these as well, but (as far as I can tell) without a deeper understanding of the quantifiers, and more of a mechanical intuition. As expected, the students were able to solve most of the exercises, even when a proper understanding of the logical semantics was not yet fully developed.
This was by design: I believe that this way it was more motivating and encouraging, as they could "make it work", compared to a more traditional approach of first throwing a lot of theory at them and then expecting them to apply it. This was confirmed by their feedback after the workshop. I was happy with my implementation. The students could immediately work with it with very few hiccups, and only one minor bug occurred2, which I could fix on the spot. Having everything run in the browser was a big plus, given that we had no stable internet connection for everyone: once the Incredible Proof Machine was loaded, the students could continue to work offline. Personally, I find the UI occasionally too sluggish, especially on the weaker laptops, but the students did not seem to mind. Some students tried to connect outputs with outputs or inputs with inputs, and the visualization did make it clearly visible that such a link is not actually connected to the block. The predicate logic part is a bit less convincing, with e.g. the scoping of local constants not easily understood. I would say that this part would work better if some explanation were given before the students start working on the more involved rules. Our room was equipped with a smartboard, and I was delighted when I found out, mostly by accident, that I could actually use my finger to drag the blocks of the proof and to draw connections. This not only gave me a "Minority Report"-like feeling, but also meant that it was much easier for the students to follow my explanations when they could just watch my hand, instead of trying to locate the mouse pointer on a regular projector. I'm generally doubtful whether such fancy technological toys are useful in the classroom, but in this case at least I liked it. The ability to scribble anywhere on the screen was occasionally a plus as well.
All in all I believe the Proof Machine was a useful tool, and I am sure that without it, it would have been tricky to have students voluntarily spend 14 hours on such a relatively dry and abstract topic. Given the amount of work that went into the development, I hope that this will not be the last occasion where it is put to good use. So if you have to teach formal logic and natural deduction-style proofs, you are welcome to use the Incredible Proof Machine to get your students excited. It is purely static, i.e. it needs no special server-side infrastructure, and you can define your own logic (i.e. proof rules), sessions and tasks. Also, there are a few interesting ways in which the Proof Machine could be extended. In particular, I'd like it to be able to generate a "normal", natural-language proof from a given proof (even if it will sound a bit mechanical) and then use hover-highlight effects to relate the formulas and sentences in the text proof to the connections and blocks in the graphical proof. Contributions are welcome!

  1. JavaScript-based, with no crypto or even obfuscation in the code.
  2. Due to differences between the task specification as given in the code, and the task specification when normalizing the syntax of the terms, the proofs that the students created were lost when they re-loaded the page.

24 September 2015

Joachim Breitner: The Incredible Proof Machine

In a few weeks, I will have the opportunity to offer a weekend workshop to selected and motivated high school students1 on a topic of my choice. My idea is to tell them something about logic, proofs, and the joy of searching and finding proofs, and the gratification of irrevocable truths. While proving things on paper is already quite nice, it is much more fun to use an interactive theorem prover, such as Isabelle, Coq or Agda: you get immediate feedback, you can experiment and play around if you are stuck, and you get lots of small successes. Someone2 once called interactive theorem proving "the world's most geekiest videogame". Unfortunately, I don't think one can get high school students without any prior knowledge in logic, or programming, or fancy mathematical symbols, to do something meaningful with a system like Isabelle, so I need something that is (much) easier to use. I always had this idea in the back of my head that proving is not so much about writing text (as in normally written proofs) or programs (as in Agda) or labeled statements (as in Hilbert-style proofs), but rather something involving facts that I have proven so far floating around freely, and a way to combine these facts into new facts, without the need to name them, or to put them in a particular order or sequence. In a way, I'm looking for LabVIEW wrestled through the Curry-Howard isomorphism. Something like this:
A proof of implication currying


So I set out, rounded up a few contributors (Thanks!), implemented this, and now I proudly present: The Incredible Proof Machine3. This interactive theorem prover allows you to perform proofs purely by dragging blocks (representing proof steps) onto the paper and connecting them properly. There is no need to learn syntax, and hence no frustration about getting that wrong. Furthermore, it comes with a number of example tasks to experiment with, so you can simply see it as a challenging computer game and work through them one by one, learning something about the logical connectives and how they work as you go. For the actual workshop, my plan is to let the students first try to solve the tasks of one session on their own, let them draw their own conclusions and come up with an idea of what they just did, and then deliver an explanation of the logical meaning of what they did. The implementation is heavily influenced by Isabelle: the software does not know anything about, say, conjunction (∧) and implication (→). At the core, everything is but an untyped lambda expression, and when two blocks are connected, it does unification4 of the propositions present on either side. This general framework is then instantiated by specifying the basic rules (or axioms) in a descriptive manner. It is quite feasible to implement other logics or formal systems on top of this as well. Another influence of Isabelle is the non-linear editing: you neither have to create the proof in a particular order nor have to manually manage a "proof focus". Instead, you can edit any bit of the proof at any time, and the system checks all of it continuously. As always, I am keen on feedback. Also, if you want to use this for your own teaching or experimenting needs, let me know. We have a mailing list for the project, and the code is on GitHub, where you can also file bug reports and feature requests. Contributions are welcome!
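As footnote 4 notes, the real machine uses higher-order pattern unification; still, a plain first-order sketch (entirely my simplification, with made-up names) illustrates what happens when the propositions on two connected ports are matched:

```haskell
import Control.Monad (foldM)
import qualified Data.Map as M

-- Propositions as first-order terms: A ∧ B would be App "and" [Var "A", Var "B"].
data Term = Var String | App String [Term]
  deriving (Eq, Show)

type Subst = M.Map String Term

-- Apply a substitution, chasing bindings until a fixed point.
applySubst :: Subst -> Term -> Term
applySubst s (Var v)    = maybe (Var v) (applySubst s) (M.lookup v s)
applySubst s (App f ts) = App f (map (applySubst s) ts)

-- Unify two terms under an existing substitution.
unify :: Term -> Term -> Subst -> Maybe Subst
unify t1 t2 s = go (applySubst s t1) (applySubst s t2)
  where
    go (Var v) t
      | t == Var v = Just s
      | occurs v t = Nothing                        -- occurs check
      | otherwise  = Just (M.insert v t s)
    go t (Var v)   = go (Var v) t
    go (App f as) (App g bs)
      | f == g && length as == length bs
                   = foldM (\acc (a, b) -> unify a b acc) s (zip as bs)
      | otherwise  = Nothing
    occurs v (Var w)    = v == w
    occurs v (App _ ts) = any (occurs v) ts
```

Connecting a block whose conclusion is X → X to a port expecting p → Y would then instantiate both X and Y to p.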
All aspects of the logic are implemented in Haskell and compiled to JavaScript using GHCJS; the UI is plain hand-written and messy JavaScript code, using JointJS to handle the graph interaction. Obviously, there is still plenty that can be done to improve the machine. In particular, the ability to create your own proof blocks, such as proof by contradiction, prove them to be valid and then use them in further proofs, is currently being worked on. And while the page will store your current progress, including all proofs you create, in your browser, it needs better ways to save, load and share tasks, blocks and proofs. Also, we'd like to add some gamification, i.e. achievements ("First proof by contradiction", "50 theorems proven"), statistics, maybe a "share theorem on Twitter" button. As the UI becomes more complicated, I'd like to investigate moving more of it into the Haskell world and use Functional Reactive Programming, i.e. Ryan Trinkle's reflex, to stay sane. Customers who liked The Incredible Proof Machine might also like these artifacts that I found while looking whether something like this exists:

  1. Students with migration background supported by the START scholarship
  2. Does anyone know the reference?
  3. We almost named it "Proofcraft", which would be a name our current Minecraft-wild youth would appreciate, but it is already taken by Gerwin Klein's blog. Also, the irony of a theorem prover being "in-credible" is worth something.
  4. Luckily, two decades ago, Tobias Nipkow published a nice implementation of higher order pattern unification as ML code, which I transliterated to Haskell for this project.

22 August 2015

Joachim Breitner: Quickest path to a local apt repository

As I'm writing this, DebConf 15 is coming to an end. I spent most of my time improving the situation of the Haskell packages in Debian, by improving the tooling and upgrading our packages to match Stackage 3.0 and build against GHC 7.10. But that is mostly of special interest (see this mail for a partial summary), so I'd like to use this post to advertise a very small and simple package I just uploaded to Debian: During one of the discussions here I noticed that it is rather tricky to make a locally built package available to apt-get. The latest version in unstable allows one to install a Debian package simply by running apt-get install on it, but in some cases, e.g. when you want a convenient way to list all packages that you made available for local use, this is insufficient. So the usual approach is to create a local apt repository with your packages. Which is non-trivial: you can use dpkg-scanpackages, apt-ftparchive or reprepro. You need to create the directories, run the commands, add the repository to your local sources. You need to worry about signing it, or setting the right options to make apt-get accept it without signing. It is precisely this work that my new package local-apt-repository automates for you: once it is installed, you simply drop the .deb file into /srv/local-apt-repository/ and after the next apt-get update the package can be installed like any other package from the archive. I chose to use the advanced features that systemd provides, namely activation upon path changes, so it works best with systemd as the init system. If you want to contribute, or test it before it passes the NEW queue, check out the git repository.

3 August 2015

Lunar: Reproducible builds: week 14 in Stretch cycle

What happened in the reproducible builds effort this week:

Toolchain fixes

akira submitted a patch to make cdbs export SOURCE_DATE_EPOCH. She uploaded a package with the enhancement to the experimental reproducible repository.

Packages fixed

The following 15 packages became reproducible due to changes in their build dependencies: dracut, editorconfig-core, elasticsearch, fish, libftdi1, liblouisxml, mk-configure, nanoc, octave-bim, octave-data-smoothing, octave-financial, octave-ga, octave-missing-functions, octave-secs1d, octave-splines, valgrind. The following packages became reproducible after getting fixed: Some uploads fixed some reproducibility issues but not all of them: In contrib, Dmitry Smirnov improved libdvd-pkg with 1.3.99-1-1. Patches submitted which have not made their way to the archive yet:

reproducible.debian.net

Four armhf build hosts were provided by Vagrant Cascadian and have been configured to be used by jenkins.debian.net. Work on including armhf builds in the reproducible.debian.net webpages has begun. So far the repository comparison page just shows us which armhf binary packages are currently missing in our repo. (h01ger) The scheduler has been changed to re-schedule more packages from stretch than sid, as the gcc5 transition has started. This mostly affects build log age. (h01ger) A new depwait status has been introduced for packages which can't be built because of missing build dependencies. (Mattia Rizzolo)

debbindiff development

Finally, on July 31st, Lunar released debbindiff 27 containing a complete overhaul of the code for the comparison stage. The new architecture is more versatile and extensible while minimizing code duplication. libarchive is now used to handle cpio archives and iso9660 images through the newly packaged python-libarchive-c. This should also help support a couple other archive formats in the future. Symlinks and devices are now properly compared.
Text files are compared as Unicode after being decoded, and encoding differences are reported. Support for Sqlite3 and Mono/.NET executables has been added. Thanks to Valentin Lorentz, the test suite should now run on more systems. A small deficiency in unsquashfs has been identified in the process. A long-standing optimization is now performed on Debian packages: based on the content of the md5sums control file, we skip comparing files with matching hashes. This makes debbindiff usable on packages with many files. Fuzzy-matching is now performed for files in the same container (like a tarball) to handle renames. Also, for Debian .changes, listed files are now compared without looking at the embedded version number. This makes debbindiff a lot more useful when comparing different versions of the same package. Based on the rearchitecturing, work has been done to allow parallel processing. The branch now seems to work most of the time. More testing needs to be done before it can be merged. The current fuzzy-matching algorithm, ssdeep, has shown disappointing results. One important use case is being able to properly compare debug symbols. Their path is made using the Build ID. As this identifier is made with a checksum of the binary content, finding things like CPP macros is much easier when a diff of the debug symbols is available. Good news is that TLSH, another fuzzy-matching algorithm, has been tested with much better results. A package is waiting in NEW and the code is ready for it to become available. A follow-up release 28 was made on August 2nd fixing the content label used for gzip, bzip2 and xz files and an error on text files only differing in their encoding. It also contains a small code improvement on how comments on Difference objects are handled. This is the last release named debbindiff. A new name has been chosen to better reflect that it is not a Debian-specific tool. Stay tuned!
Documentation update

Valentin Lorentz updated the patch submission template to suggest writing the kind of issue in the bug subject. Small progress has been made on the Reproducible Builds HOWTO while preparing the related CCCamp15 talk.

Package reviews

235 obsolete reviews have been removed, 47 added and 113 updated this week. 42 reports for packages failing to build from source have been made by Chris West (Faux). New issue added this week: haskell_devscripts_locale_substvars.

Misc.

Valentin Lorentz wrote a script to report packages tested as unreproducible installed on a system. We encourage everyone to run it on their systems and give feedback!

9 July 2015

Simon Kainz: DUCK challenge: week 1

After announcing the DUCK challenge last week the following packages were fixed and uploaded into unstable: A big "Thank You" to you. The list of the fixed and updated packages is available here. I will try to update this ~daily. If I missed one of your uploads, please drop me a line. There is still lots of time till the end of DebConf15 and the end of the DUCK Challenge, so please get involved.

22 June 2015

Lunar: Reproducible builds: week 8 in Stretch cycle

What happened about the reproducible builds effort this week:

Toolchain fixes

Andreas Henriksson has improved Johannes Schauer's initial patch for pbuilder adding support for build profiles.

Packages fixed

The following 12 packages became reproducible due to changes in their build dependencies: collabtive, eric, file-rc, form-history-control, freehep-chartableconverter-plugin, jenkins-winstone, junit, librelaxng-datatype-java, libwildmagic, lightbeam, puppet-lint, tabble. The following packages became reproducible after getting fixed: Some uploads fixed some reproducibility issues but not all of them: Patches submitted which have not made their way to the archive yet:

reproducible.debian.net

Bugs with the ftbfs usertag are now visible on the bug graphs. This explains the recent spike. (h01ger) Andreas Beckmann suggested a way to test building packages using the funny paths that one can get when they contain the full Debian package version string.

debbindiff development

Lunar started an important refactoring introducing abstractions for containers and files in order to make file type identification more flexible, enabling fuzzy matching, and allowing parallel processing.

Documentation update

Ximin Luo detailed the proposal to standardize environment variables to pass a reference source date to tools that need one (e.g. a documentation generator).

Package reviews

41 obsolete reviews have been removed, 168 added and 36 updated this week. Some more issues affecting packages failing to build from source have been identified.

Meetings

Minutes have been posted for the Tuesday June 16th meeting. The next meeting is scheduled for Tuesday June 23rd at 17:00 UTC.

Presentations

Lunar presented the project in French during Pas Sage en Seine in Paris. Video and slides are available.

20 June 2015

Joachim Breitner: Running circle-packing in the Browser, now using GHCJS

Quite a while ago, I wrote a small Haskell library called circle-packing to pack circles in a tight arrangement. Back then, I used the Haskell-to-JavaScript compiler fay to create a pretty online demo of that library, and shortly after, I created the identical demo using haste (another Haskell-to-JavaScript compiler). The main competitor of these two compilers, and the most promising one, is GHCJS. Back then, it was too annoying to install. But after two years, things have changed, and it only takes a few simple commands to get GHCJS running, so I finally created the circle packing demo in a GHCJS variant. Quick summary: Cabal integration is very good (like haste, but unlike fay), interfacing JavaScript is nice and easy (like fay, but unlike haste), and a quick check seems to indicate that it is faster than either of these two. I should note that I did not update the other two demos, so they represent the state of fay and haste back then, respectively. With GHCJS now available at my fingertips, maybe I will produce some more Haskell to be run in your browser. For example, I could port FrakView, a GUI program to render, explore and explain iterated function systems, from GTK to HTML.
